HashMap 的实现原理

HashMap是Java集合框架中常用的哈希表实现，用于存储键值对。它提供了高效的插入、删除和查找操作。以下是HashMap的实现原理及其主要特点：

# 1. 数据结构

HashMap 主要基于数组和链表（在 Java 8 之后，还引入了红黑树）的混合数据结构。其内部结构可以简化为：

数组：存储 Node 对象（键值对）。
链表：用于处理哈希冲突，当多个键具有相同的哈希值时，它们会被链接在同一个链表中。
红黑树：当链表中的元素数量超过一定阈值（默认是 8）时，链表会转换成红黑树，以提高查找效率。

# 2. 主要参数

初始容量（initial capacity）：HashMap在创建时可以指定的初始数组大小，默认值是 16。
负载因子（load factor）：用来衡量 HashMap 满的程度，默认值是 0.75。当实际大小超过容量与负载因子的乘积时，HashMap 会进行扩容。

# 3. 哈希函数

HashMap 使用键的 hashCode() 方法计算哈希值，并通过二次哈希扰动函数减少冲突。具体实现是：

static final int hash(Object key) {
    int h;
    return (key == null) ? 0 : (h = key.hashCode()) ^ (h >>> 16);
}

这个函数通过对高位和低位的混合（异或操作），使得哈希码的分布更加均匀，减少冲突。

# 4. 存储与取值

插入元素（put 操作）

计算哈希值：通过键的 hashCode() 方法和哈希函数计算哈希值。
计算索引：通过哈希值与数组长度取模（(n - 1) & hash）确定元素在数组中的位置。
处理冲突：如果计算出的索引位置已经有元素：
- 使用链表：在该位置创建一个链表，将新元素添加到链表末尾。
- 使用红黑树：如果链表长度超过阈值，将链表转换为红黑树，并将新元素添加到树中。
插入元素：将新元素插入到计算出的索引位置。

取值（get 操作）

计算哈希值：通过键的 hashCode() 方法和哈希函数计算哈希值。
计算索引：通过哈希值与数组长度取模（(n - 1) & hash）确定元素在数组中的位置。
查找元素：在计算出的索引位置查找元素：
- 如果该位置是链表，遍历链表查找匹配的键。
- 如果该位置是红黑树，遍历树查找匹配的键。

# 5. 扩容机制

当 HashMap 中的元素数量超过容量与负载因子的乘积时，会进行扩容操作：

扩容：将数组容量扩大为原来的两倍。
重新哈希：将原数组中的所有元素重新计算哈希值，并将它们移动到新数组中。

# 6. 线程安全性

HashMap 不是线程安全的。如果需要在多线程环境中使用，可以使用Collections.synchronizedMap 方法将其包装成同步的 Map，或者使用 ConcurrentHashMap 来提供更高效的并发支持。

示例代码以下是一个简化的 HashMap 的实现示例：

public class SimpleHashMap<K, V> {
    static class Node<K, V> {
        final int hash;
        final K key;
        V value;
        Node<K, V> next;

        Node(int hash, K key, V value, Node<K, V> next) {
            this.hash = hash;
            this.key = key;
            this.value = value;
            this.next = next;
        }
    }

    private Node<K, V>[] table;
    private int size;
    private int capacity = 16;
    private float loadFactor = 0.75f;
    private int threshold;

    public SimpleHashMap() {
        table = new Node[capacity];
        threshold = (int) (capacity * loadFactor);
    }

    private int hash(Object key) {
        int h = key.hashCode();
        return (h) ^ (h >>> 16);
    }

    public void put(K key, V value) {
        int hash = hash(key);
        int index = (capacity - 1) & hash;
        Node<K, V> newNode = new Node<>(hash, key, value, null);

        if (table[index] == null) {
            table[index] = newNode;
        } else {
            Node<K, V> current = table[index];
            while (true) {
                if (current.key.equals(key)) {
                    current.value = value;
                    return;
                }
                if (current.next == null) {
                    current.next = newNode;
                    break;
                }
                current = current.next;
            }
        }
        size++;
        if (size > threshold) {
            resize();
        }
    }

    public V get(Object key) {
        int hash = hash(key);
        int index = (capacity - 1) & hash;
        Node<K, V> current = table[index];

        while (current != null) {
            if (current.key.equals(key)) {
                return current.value;
            }
            current = current.next;
        }
        return null;
    }

    private void resize() {
        capacity *= 2;
        threshold = (int) (capacity * loadFactor);
        Node<K, V>[] newTable = new Node[capacity];

        for (int i = 0; i < table.length; i++) {
            Node<K, V> current = table[i];
            while (current != null) {
                int newIndex = (capacity - 1) & current.hash;
                Node<K, V> next = current.next;
                current.next = newTable[newIndex];
                newTable[newIndex] = current;
                current = next;
            }
        }
        table = newTable;
    }
}

# 总结

HashMap 通过哈希表实现键值对的存储，使用链表和红黑树处理哈希冲突，并通过扩容机制保持性能。它提供了高效的插入、删除和查找操作，但在多线程环境中需要额外的同步措施以保证线程安全。

江离书生

Choose mode